RBD analysis report
Introduction
This is the WNP report for RBD. The report is interactive: click around on the plots and use the download buttons to export the tables/sequences for further analysis!
Want to know more about how the results were generated? Check the bottom of the report 👇
Top 100 most abundant sequences in each round
The trees in this section visualise the 100 most abundant clones in each round. Abundance is measured in counts per million (CPM), which normalises for sequencing depth, so values can be compared between rounds.
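To make the CPM measure concrete, here is a minimal Python sketch of how raw read counts could be scaled to counts per million and ranked. It is not the report's actual code: the function names `counts_per_million` and `top_n`, and the toy sequence labels, are hypothetical and used only for illustration.

```python
from collections import Counter


def counts_per_million(counts):
    """Scale raw read counts so that they sum to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to normalise")
    return {seq: n * 1_000_000 / total for seq, n in counts.items()}


def top_n(cpm, n=100):
    """Return the n most abundant (sequence, CPM) pairs, highest first."""
    return sorted(cpm.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Toy data (hypothetical): two rounds sequenced at different depths.
round1 = Counter({"clone_A": 50, "clone_B": 30, "clone_C": 20})
round2 = Counter({"clone_A": 800, "clone_B": 150, "clone_C": 50})

cpm1 = counts_per_million(round1)
cpm2 = counts_per_million(round2)

# Raw counts differ tenfold in depth, but CPM values are directly comparable.
top2_round2 = top_n(cpm2, n=2)
```

Because each round is divided by its own total, a clone's CPM reflects its share of the library rather than how deeply that round happened to be sequenced.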
Round 1
Round 2
Enriched clusters
PCA plot & top 100s
Tree of the top 100 most enriched clusters
Analysis method
Raw sequencing reads are first trimmed with Trim Galore to remove sequencing adapters, then merged with FLASH, as two paired reads are required to cover the entire nanobody sequence. Quality control of the run is then performed with MultiQC. To identify the important parts of the nanobody sequence (for example the germline genes and CDRs), IgBLAST is used with a custom alpaca reference (created using VDJ sequences deposited in the IMGT database).
Before analysis, reads that likely represent sequencing errors are removed: those containing frameshifts or stop codons, or with very low counts per million (CPM). Filtered nanobody sequences are then tracked through the panning process to determine their enrichment, measured as log2 fold change from Round 0 (before panning) to the end of the panning process (usually Round 2). Enriched nanobodies are then clustered to remove redundancy and, if there are a large number of clusters, a top 100 is chosen.
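The filtering and enrichment steps can be sketched in Python as follows. This is an illustrative simplification, not the pipeline's implementation: in the report these calls are presumably derived from the IgBLAST annotation, whereas the sketch assumes each sequence starts in frame and treats a length that is not a multiple of three as a frameshift. The names `is_productive` and `log2_fold_change`, and the pseudocount of 1 CPM used to avoid dividing by zero, are assumptions.

```python
import math

STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_productive(nt_seq):
    """True if the sequence has no frameshift and no in-frame stop codon.

    Assumption: the sequence starts in frame, so a length that is not a
    multiple of three is treated as a frameshift.
    """
    if len(nt_seq) % 3 != 0:
        return False
    codons = (nt_seq[i:i + 3].upper() for i in range(0, len(nt_seq), 3))
    return not any(codon in STOP_CODONS for codon in codons)


def log2_fold_change(cpm_start, cpm_end, pseudocount=1.0):
    """log2 ratio of end-round CPM to start-round CPM.

    The pseudocount (an assumption) keeps the ratio defined when a
    sequence is absent from one of the rounds.
    """
    return math.log2((cpm_end + pseudocount) / (cpm_start + pseudocount))


# Toy examples (hypothetical sequences and CPM values).
kept = [s for s in ["ATGGCC", "ATGGCCTGA", "ATGGC"] if is_productive(s)]
lfc = log2_fold_change(cpm_start=7.0, cpm_end=31.0)  # log2(32 / 8)
```

A positive value means the sequence became more abundant over panning, and a value of 2 corresponds to a fourfold increase.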
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#fbf0ed',
'primaryTextColor': '#e83e8c',
'primaryBorderColor': '#e83e8c',
'lineColor': '#FF784F',
'secondaryColor': '#006100',
'tertiaryColor': '#fff'
}
}
}%%
flowchart TD
import["Raw sequencing read pairs"] --> trim_merge["Trim and merge raw reads"]
trim_merge --> seqQC["Quality control of sequencing run"]
seqQC --> annotate["Annotate nanobody germline genes, CDRs"]
annotate --> filtering["Filter to remove sequencing errors"]
filtering --> enrichment["Determine enrichment (log2 fold change) across panning rounds"]
enrichment --> clustering["Group similar nanobodies into clusters"]
clustering --> top_100["If required, narrow enriched clusters down to a top 100"]